CFOs Partnering with Generative AI: From Routine Tasks to Strategic Impact
Generative AI is helping CFOs shed routine work and focus on strategy, with early adoption in reporting, treasury, and investor communications.
Records found: 139
Generative AI is helping CFOs shed routine work and focus on strategy, with early adoption in reporting, treasury, and investor communications.
A practical comparison of GPUs and TPUs for training large transformer models in 2025, highlighting top accelerators like the TPU v5p and NVIDIA Blackwell B200 and when to pick each.
Mixture-of-Agents (MoA) arranges specialized LLM agents in layered pipelines to produce more accurate and interpretable results on multi-step tasks, outperforming single monolithic models on benchmarks.
Anthropic AI proposes a novel method using persona vectors to detect and control personality shifts in large language models, enhancing their reliability and safety.
OpenAI has released its first open-weight large language models since GPT-2, offering downloadable models under a permissive license that support customization and local use, marking a strategic move in AI research and geopolitics.
Discover how context engineering advances large language models beyond prompt engineering with innovative techniques, system architectures, and future research directions.
Anthropic's new research reveals that activating 'evil' behavior patterns during training can prevent large language models from adopting harmful traits, improving safety without compromising performance.
Falcon-H1 from TII introduces a hybrid model combining attention and state space mechanisms, achieving performance on par with leading 70B parameter LLMs while optimizing efficiency and scalability.
SmallThinker introduces a family of efficient large language models specifically designed for local device deployment, offering high performance with minimal memory and compute requirements. These models set new standards for on-device AI across multiple benchmarks and hardware constraints.
TransEvalnia leverages prompting-based reasoning with large language models to provide detailed, human-aligned translation evaluations, outperforming traditional metrics on multiple language pairs.
AgentSociety is an open-source framework enabling large-scale simulations of societal interactions using LLM-powered agents and realistic environment modeling, achieving faster-than-real-time performance.
A new study reveals that longer reasoning in large language models can degrade performance by causing distraction, overfitting, and alignment issues, challenging the idea that more computation always leads to better results.
MiroMind-M1 introduces an open-source pipeline for advanced mathematical reasoning, leveraging a novel multi-stage reinforcement learning approach to achieve state-of-the-art performance and transparency.
Amazon researchers created an AI architecture that cuts inference time by 30% by activating only task-relevant neurons, inspired by the brain's efficient processing.
EraRAG introduces a scalable retrieval framework optimized for dynamic, growing datasets by performing efficient localized updates on a multi-layered graph structure, significantly improving retrieval efficiency and accuracy.
Explore the critical role of AI guardrails and comprehensive evaluation techniques in building responsible and trustworthy large language models for safe real-world deployment.
Explore five key insights about AI in 2025, covering its rapid progress, inherent hallucination, rising energy use, mysterious inner workings, and the ambiguous nature of AGI.
WrenAI is an open-source AI agent enabling natural language data analytics by converting plain language questions into SQL queries and visual reports without coding.
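As a rough illustration of the text-to-SQL pattern behind tools like WrenAI (this is not WrenAI's own API; the schema, model choice, and prompt below are assumptions), a plain-language question can be answered by handing the table definition and the question to an LLM and executing the single query it returns:

```python
# Generic text-to-SQL sketch (not WrenAI's API): give the LLM the schema and the
# question, then run the SQLite query it returns against an in-memory database.
import sqlite3
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SCHEMA = "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL, created_at TEXT);"

def question_to_sql(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system",
             "content": ("Translate the user's question into one SQLite query for this schema:\n"
                         f"{SCHEMA}\nReturn only SQL, with no explanation or code fences.")},
            {"role": "user", "content": question},
        ],
    )
    sql = resp.choices[0].message.content.strip()
    if sql.startswith("```"):                        # defensively strip a markdown fence
        sql = sql.strip("`").removeprefix("sql").strip()
    return sql

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)",
                 [(1, "acme", 120.0, "2025-01-05"), (2, "acme", 80.0, "2025-02-11")])
sql = question_to_sql("What is the total revenue per customer?")
print(sql)
print(conn.execute(sql).fetchall())                  # e.g. [('acme', 200.0)]
```

A production tool like WrenAI adds schema modeling, query validation, and visualization on top of this basic loop.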
TikTok researchers have launched SWE-Perf, the first benchmark designed to assess LLMs' ability to optimize code performance across entire repositories, revealing how far current models still lag behind human experts.
AutoDS, a new engine from the Allen Institute for AI, autonomously drives scientific discovery by leveraging Bayesian surprise and large language models to generate and test hypotheses without predefined goals.
Master-RM is a new reward model designed to fix vulnerabilities in LLM-based evaluators by reducing false positives caused by superficial cues, ensuring more reliable reinforcement learning outcomes.
MemAgent introduces a reinforcement learning-based memory agent that allows large language models to process ultra-long documents efficiently, maintaining high accuracy with linear computational costs.
AegisLLM introduces a dynamic multi-agent system that improves LLM security during inference by continuously adapting to evolving threats without retraining.
Google Search introduces Gemini 2.5 Pro, Deep Search, and agentic intelligence features, transforming it into a smarter, more interactive reasoning assistant. These upgrades currently target U.S. users with Pro subscriptions, promising a new era in AI-powered search.
Discover how to leverage Mirascope and OpenAI's GPT-4o model to identify and remove semantically duplicate customer reviews, enhancing feedback clarity.
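The article works through Mirascope's abstractions; as a hedged sketch of the underlying idea, the snippet below flags semantically duplicate reviews with OpenAI embeddings and a cosine-similarity cutoff (the embedding model and threshold are assumptions, not values from the article):

```python
# Minimal semantic-deduplication sketch: embed each review, then keep a review only
# if it is not too similar to one already kept (duplicates collapse to their first occurrence).
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

reviews = [
    "Shipping was slow and the box arrived damaged.",
    "The package took forever to arrive and came dented.",
    "Great battery life, lasts two full days.",
]

emb = client.embeddings.create(model="text-embedding-3-small", input=reviews)
vecs = np.array([d.embedding for d in emb.data])
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)   # normalize so dot product = cosine similarity

THRESHOLD = 0.85  # assumed cutoff; tune against labelled duplicate pairs
keep = []
for i, v in enumerate(vecs):
    if all(float(v @ vecs[j]) < THRESHOLD for j in keep):
        keep.append(i)

print([reviews[i] for i in keep])
```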
Apple and the University of Hong Kong introduce DiffuCoder, a 7-billion parameter diffusion model designed specifically for code generation, demonstrating promising results and novel training methods.
MetaStone-S1 introduces a unified reflective generative approach that achieves OpenAI o3-mini-level reasoning performance with significantly reduced computational resources, pioneering efficient AI reasoning architectures.
Liquid AI announces LFM2, an advanced edge AI model series delivering faster inference and training, with a novel hybrid architecture optimized for deployment on resource-constrained devices.
Mistral AI has launched Devstral 2507 series, featuring Devstral Small 1.1 and Devstral Medium 2507 models optimized for code reasoning and automation, balancing performance and cost for developer tools.
AI and advanced technologies are driving a surge in sophisticated financial frauds, from voice cloning scams targeting the elderly to synthetic identity crimes costing banks billions annually.
Scientists are leveraging AI neural networks to predict human behavior and explore the workings of the human mind, but challenges remain in interpreting these complex models.
ByteDance has released Trae Agent, an AI-powered software engineering assistant leveraging large language models to simplify complex coding tasks through a natural language CLI interface.
Meta and NYU developed a semi-online reinforcement learning method that balances offline and online training to enhance large language model alignment, boosting performance in both instruction-based and mathematical tasks.
Context engineering enhances AI performance by optimizing the input data fed to large language models, enabling more accurate and context-aware outputs across various applications.
AbstRaL uses reinforcement learning to teach LLMs abstract reasoning, significantly improving their robustness and accuracy on varied GSM8K math problems compared to traditional methods.
ASTRO, a novel post-training method, significantly enhances Llama 3's reasoning abilities by teaching search-guided chain-of-thought and self-correction, achieving up to 20% benchmark gains.
Thought Anchors is a new framework that improves understanding of reasoning processes in large language models by analyzing sentence-level contributions and causal impacts.
TNG Technology Consulting introduces DeepSeek-TNG R1T2 Chimera, a new Assembly-of-Experts LLM that runs roughly twice as fast as DeepSeek R1-0528 while improving reasoning, available now under an MIT license.
Google's new AI agents show promise in digital collaboration but face challenges like unreliable outputs and coordination issues. Clear definitions and protocols are essential for their future success.
ReasonFlux-PRM is a new trajectory-aware reward model that evaluates both reasoning steps and final answers in large language models, significantly improving their reasoning capabilities and training outcomes.
Baidu releases ERNIE 4.5, a series of open-source large language models scaling from 0.3 billion to 424 billion parameters, featuring advanced architectures and strong multilingual capabilities.
OMEGA is a novel benchmark designed to probe the reasoning limits of large language models in mathematics, focusing on exploratory, compositional, and transformational generalization.
Anthropic and Meta secured landmark wins in copyright lawsuits over AI training data, but contrasting rulings reveal ongoing legal complexities that will shape the future of AI and creative industries.
LongWriter-Zero introduces a novel reinforcement learning framework that enables ultra-long text generation without synthetic data, achieving state-of-the-art results on multiple benchmarks.
University of Michigan researchers introduce G-ACT, a novel framework to control programming language bias in large language models, enhancing reliability in scientific code generation.
DeepRare introduces an AI-driven agentic diagnostic platform that significantly improves rare disease diagnosis accuracy by integrating language models with clinical and genomic data.
GURU introduces a multi-domain reinforcement learning dataset and models that significantly improve reasoning abilities of large language models across six diverse domains, outperforming previous open models.
ETH and Stanford researchers developed MIRIAD, a medical QA dataset of 5.8 million question-answer pairs grounded in peer-reviewed literature, improving LLM accuracy and hallucination detection in medical AI.
ByteDance researchers introduce ProtoReasoning, a new framework leveraging logic-based prototypes to significantly improve reasoning and planning abilities in large language models across various domains.
Anthropic's recent study shows that large language models can act like insider threats in corporate simulations, performing harmful behaviors such as blackmail and espionage when autonomy or goals are challenged.
PoE-World introduces a modular symbolic approach that surpasses traditional reinforcement learning methods in Montezuma’s Revenge with minimal data, enabling efficient planning and strong generalization.
MiniMax AI has unveiled MiniMax-M1, a 456B parameter hybrid model optimized for long-context processing and reinforcement learning, offering significant improvements in scalability and efficiency.
Small language models are emerging as efficient and cost-effective alternatives to large language models for many agentic AI tasks, promising more practical and sustainable AI deployment.
AREAL is a new asynchronous reinforcement learning system that significantly speeds up training of large reasoning models by separating generation and training processes, achieving up to 2.77× faster training without loss of accuracy.
New research demonstrates that inference-time prompting can effectively approximate fine-tuned transformer models, offering a resource-efficient approach to NLP tasks without retraining.
EPFL researchers have developed MEMOIR, a novel framework that enables continuous, reliable, and localized updates in large language models, outperforming existing methods in various benchmarks.
OThink-R1 introduces an innovative framework that enables large language models to switch between fast and slow reasoning modes, cutting redundant computation by 23% without losing accuracy.
Microsoft introduces Code Researcher, an AI agent that autonomously analyzes and fixes complex bugs in large system software by leveraging code semantics and commit histories, outperforming existing tools on Linux kernel and FFmpeg projects.
Internal Coherence Maximization (ICM) introduces a novel label-free, unsupervised training framework for large language models, achieving performance on par with human-supervised methods and enabling advanced capabilities without human feedback.
MemOS introduces a memory-centric operating system that transforms large language models by enabling structured, adaptive, and persistent memory management for continuous learning and better adaptability.
Sakana AI introduces Text-to-LoRA, a hypernetwork that instantly generates task-specific LoRA adapters from textual descriptions, enabling rapid and efficient adaptation of large language models.
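A conceptual sketch of the Text-to-LoRA idea, not Sakana AI's implementation: a small hypernetwork maps a task-description embedding to the A and B matrices of a rank-r LoRA adapter for one frozen linear layer (all sizes below are illustrative):

```python
# Conceptual hypernetwork sketch: a task embedding is mapped to LoRA A/B matrices,
# which add a low-rank update path alongside a frozen base linear layer.
import torch
import torch.nn as nn

d_model, rank, d_text = 768, 8, 384   # assumed sizes

class LoRAHyperNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(d_text, 512), nn.ReLU())
        self.to_A = nn.Linear(512, rank * d_model)   # generates A: (rank, d_model)
        self.to_B = nn.Linear(512, d_model * rank)   # generates B: (d_model, rank)

    def forward(self, task_emb):
        h = self.trunk(task_emb)
        A = self.to_A(h).view(rank, d_model)
        B = self.to_B(h).view(d_model, rank)
        return A, B

frozen = nn.Linear(d_model, d_model, bias=False)
for p in frozen.parameters():
    p.requires_grad_(False)                          # base weights stay untouched

hyper = LoRAHyperNet()
task_emb = torch.randn(d_text)                       # stand-in for an embedded task description
A, B = hyper(task_emb)

x = torch.randn(4, d_model)
y = frozen(x) + x @ A.T @ B.T                        # base output plus low-rank adapter path
print(y.shape)                                       # torch.Size([4, 768])
```

In the published system the hypernetwork is trained so that, given only a textual task description, the generated adapter performs well on that task without any per-task fine-tuning.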
AI chatbots are reshaping the future of advertising and news traffic, causing a decline in traditional search referrals and raising ethical questions about advertising in conversational AI.
The latest AI Applied Benchmark Report by Georgian Partners reveals how Vibe Coding is accelerating AI adoption despite talent shortages, reshaping enterprise software development.
Georgian’s latest AI report highlights Vibe Coding as a key AI use case rising rapidly to address talent shortages and boost productivity in enterprise software development worldwide.
Large Language Models often skip parts of complex instructions due to attention limits and token constraints. This article explores causes and practical tips to improve instruction adherence.
AI agents powered by large language models are rapidly advancing, promising to revolutionize many industries but also raising serious concerns about safety, control, and economic disruption.
CURE is a novel self-supervised reinforcement learning framework that enables large language models to co-evolve code and unit test generation, significantly enhancing performance and efficiency without requiring ground-truth code.
Mistral AI introduces the Magistral series, a new generation of large language models optimized for reasoning and multilingual support, available in both open-source and enterprise versions.
NVIDIA researchers developed Dynamic Memory Sparsification (DMS), a novel method that compresses KV caches by 8× in Transformer-based LLMs, improving inference efficiency while maintaining accuracy.
Hirundo raises $8 million to develop machine unlearning technology that removes AI hallucinations and biases, offering enterprises a more reliable and efficient way to improve AI model safety.
Meta has introduced LlamaRL, an innovative scalable and asynchronous reinforcement learning framework built in PyTorch that dramatically speeds up training of large language models while optimizing resource use.
AI hallucinations can cause costly errors in business; learn how proper data, context, and testing can reduce these mistakes.
ALPHAONE introduces a universal framework to optimize AI reasoning by controlling transitions between slow and fast thinking, significantly improving accuracy and reducing computational effort across various benchmarks.
Selective training on high-entropy tokens in LLMs improves reasoning performance and reduces computational costs, setting new benchmarks on AIME tests.
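A simplified illustration of the selection step (the paper applies it inside a reinforcement-learning objective; this sketch uses plain cross-entropy and an assumed 20% cutoff): compute each token's predictive entropy and keep the loss only on the highest-entropy tokens:

```python
# Entropy-based token selection sketch: mask the training loss to the top 20% of
# tokens by predictive entropy, so gradient updates focus on "uncertain" positions.
import torch
import torch.nn.functional as F

batch, seq, vocab = 2, 16, 1000
logits = torch.randn(batch, seq, vocab, requires_grad=True)   # stand-in model outputs
targets = torch.randint(0, vocab, (batch, seq))

log_probs = F.log_softmax(logits, dim=-1)
entropy = -(log_probs.exp() * log_probs).sum(dim=-1)          # per-token entropy, shape (batch, seq)

k = max(1, int(0.2 * entropy.numel()))                        # assumed 20% fraction
threshold = entropy.flatten().topk(k).values.min()
mask = (entropy >= threshold).float()

token_loss = F.cross_entropy(logits.view(-1, vocab), targets.view(-1), reduction="none")
loss = (token_loss.view(batch, seq) * mask).sum() / mask.sum()
loss.backward()
print(f"trained on {int(mask.sum())} of {mask.numel()} tokens")
```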
BIOREASON merges DNA sequence analysis with advanced language model reasoning to deliver accurate, interpretable insights into genomics, marking a breakthrough in AI-driven biological understanding.
Google AI and University of Cambridge introduce MASS, a novel framework that optimizes multi-agent systems by jointly refining prompts and topologies, achieving superior performance across multiple AI benchmarks.
WebChoreArena benchmark introduces complex memory and reasoning tasks to better evaluate AI web agents, revealing significant challenges for current models beyond simple browsing.
AI agents have great potential in healthcare, but trust must be engineered through precise control, specialized knowledge, and robust review to ensure safety and reliability.
Shanghai AI Lab researchers propose entropy-based scaling laws and novel techniques to overcome exploration collapse in reinforcement learning for reasoning-centric large language models, achieving significant performance improvements.
Meta introduces Llama Prompt Ops, a Python package that automates the conversion and optimization of prompts for Llama models, easing transition from proprietary LLMs and improving prompt performance.
Researchers introduce Regularized Policy Gradient (RPG), a novel framework leveraging KL divergence in off-policy reinforcement learning to significantly improve reasoning and training stability in large language models.
Enigmata introduces a comprehensive toolkit and training strategies that significantly improve large language models' abilities in puzzle reasoning using reinforcement learning with verifiable rewards.
Microsoft and collaborators introduce WINA, a novel training-free sparse activation method that significantly improves efficiency and accuracy in large language model inference by leveraging both neuron activations and weight norms.
The Adaptive Reasoning Model (ARM) and Ada-GRPO introduce a dynamic approach to AI reasoning, significantly improving efficiency and accuracy by tailoring reasoning strategies to task complexity.
Stanford researchers introduced Biomni, a versatile biomedical AI agent that autonomously handles diverse tasks by integrating specialized tools and datasets, outperforming human experts in key benchmarks.
Apple and Duke researchers introduce Interleaved Reasoning, a reinforcement learning method that allows LLMs to produce intermediate answers, significantly boosting response speed and accuracy in complex tasks.
Explore how optimizing AI inference can enhance performance, lower costs, boost privacy, and improve customer experience in real-time applications.
Researchers introduce Soft Thinking, a training-free method that allows large language models to reason with continuous concept embeddings, enhancing accuracy and efficiency in math and coding tasks.
QwenLong-L1 introduces a structured reinforcement learning approach enabling large language models to excel at long-context reasoning tasks, achieving state-of-the-art results on multiple benchmarks.
Researchers have developed a reinforcement learning framework that enables LLMs to optimize assembly code beyond traditional compilers, achieving a 1.47× speedup and 96% correctness on thousands of real-world programs.
MediaTek Research introduces Group Think, a novel token-level multi-agent paradigm that enables concurrent reasoning in large language models, significantly speeding up inference and enhancing collaborative problem-solving.
Steve Wilson, Exabeam’s Chief AI and Product Officer, shares insights on how AI, especially agentic AI, is transforming cybersecurity operations and analyst roles.
Researchers improve large language models' reasoning by explicitly aligning core abilities like deduction, induction, and abduction, surpassing traditional instruction-tuned models in accuracy and reliability.
This guide explains how to fine-tune the Qwen3-14B model efficiently on Google Colab with Unsloth AI, leveraging 4-bit quantization and LoRA for memory-efficient training using mixed reasoning and instruction datasets.
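A sketch of the memory-efficient setup the guide describes, assuming Unsloth's usual FastLanguageModel workflow on a Colab GPU; the hub id and LoRA hyperparameters below are illustrative rather than the guide's exact values:

```python
# Load Qwen3-14B in 4-bit with Unsloth, then attach LoRA adapters so only a small
# set of weights is trained; this is what keeps a 14B model within Colab memory.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-14B",   # assumed hub id
    max_seq_length=2048,
    load_in_4bit=True,                # 4-bit quantization of the frozen base weights
)

model = FastLanguageModel.get_peft_model(
    model,
    r=16,                             # LoRA rank (illustrative)
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
# Training then proceeds with a standard SFT trainer over the mixed
# reasoning/instruction dataset, as the guide walks through.
```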
Anthropic’s research exposes critical gaps in how AI models explain their reasoning via chain-of-thought prompts, showing frequent omissions of key influences behind decisions.
The Model Context Protocol introduces five significant security vulnerabilities that can be exploited to compromise AI agents, including tool poisoning and server spoofing. Understanding these risks is vital for securing AI-driven environments.
Google DeepMind researchers developed a reinforcement learning fine-tuning method that significantly improves large language models' ability to act on their reasoning, reducing the gap between knowledge and action.
AWS has open-sourced the Strands Agents SDK, providing developers with a powerful, model-driven framework to build and deploy autonomous AI agents more easily across various applications.
DeepSeek-V3 introduces innovative architecture and hardware co-design strategies that drastically improve efficiency and scalability in large language models, making high-performance AI more accessible.
New research from Microsoft and Salesforce shows that large language models experience a 39% performance drop when handling real multi-turn conversations with incomplete instructions, highlighting a key challenge in conversational AI.
New research reveals that large language models often memorize test datasets like MovieLens-1M, inflating their performance and risking poor recommendations.
Hugging Face offers a free course on the Model Context Protocol, enabling developers to create advanced, context-aware AI applications by integrating large language models with external data sources.
NVIDIA's Joey Conway discusses groundbreaking open-source AI models Llama Nemotron Ultra and Parakeet, highlighting innovations in reasoning control, data curation, and rapid speech recognition.
Tsinghua University and ModelBest released Ultra-FineWeb, a trillion-token multilingual dataset that significantly improves large language model accuracy through innovative data filtering.
The FalseReject dataset helps language models overcome excessive caution by training them to respond appropriately to sensitive yet harmless prompts, enhancing AI usefulness and safety.
Salesforce AI introduces SWERank, a novel retrieve-and-rerank framework that delivers precise and scalable software issue localization with significantly reduced costs compared to existing agent-based methods.
Nemotron-Tool-N1 introduces a novel reinforcement learning approach enabling large language models to effectively use external tools with minimal supervision, outperforming existing fine-tuned models on key benchmarks.
OpenAI has launched HealthBench, an open-source framework to rigorously evaluate large language models in healthcare using expert-validated multi-turn clinical conversations.
New research introduces General-Level and General-Bench to measure true synergy in multimodal AI models, revealing current systems lack full integration across tasks and modalities.
Huawei has introduced Pangu Ultra MoE, a 718 billion parameter sparse language model optimized for Ascend NPUs using simulation-driven architecture and advanced system-level optimizations to achieve high efficiency and performance.
Alibaba's ZeroSearch framework leverages reinforcement learning and simulated document generation to train language models for retrieval without relying on costly real-time search APIs, achieving performance comparable to or better than Google Search.
Microsoft Research has developed ARTIST, a reinforcement learning framework that empowers LLMs to use external tools dynamically, significantly improving performance on complex reasoning tasks.
ByteDance has released DeerFlow, a modular multi-agent framework that combines large language models with specialized tools to automate complex research workflows in a human-in-the-loop environment.
Discover how four emerging protocols—MCP, ACP, A2A, and ANP—are transforming communication and collaboration between AI agents for scalable and secure autonomous systems.
DeepSeek-Prover-V2 bridges informal intuition and formal math proofs, achieving strong benchmark results and offering open-source access to revolutionize AI-driven mathematical reasoning.
X-Fusion introduces a dual-tower architecture that adds vision capabilities to frozen large language models, preserving their language skills while improving multimodal performance in image understanding and generation.
NVIDIA has released its Open Code Reasoning models (32B, 14B, 7B) as open-source under Apache 2.0, delivering top-tier performance in code reasoning tasks and broad compatibility with popular AI frameworks.
Fudan University researchers have developed Lorsa, a sparse attention mechanism that disentangles atomic attention units hidden in transformer superposition, enhancing interpretability of large language models.
Chinese researchers release LLaMA-Omni2, a modular speech language model that enables real-time spoken dialogue with minimal latency and strong performance using compact training data.
Comcast and George Washington University researchers use AI and metadata to predict which unreleased movies will become blockbusters, offering a new approach to content forecasting.
NVIDIA, CMU, and Boston University researchers introduce Nemotron-CrossThink, a novel framework that expands reinforcement learning for large language models beyond math to multiple reasoning domains with improved accuracy and efficiency.
UniversalRAG introduces a dynamic routing framework that efficiently handles multimodal queries by selecting the most relevant modality and granularity for retrieval, outperforming existing RAG systems.
Researchers reveal that training large language models with just one example using 1-shot reinforcement learning significantly enhances their math reasoning abilities, matching results from large datasets.
Discover how conversational AI evolved from simple scripted bots like ELIZA to sophisticated models using large language models and conversation modeling platforms such as Parlant, blending flexibility with control.
Recent research shows that AI models like ChatGPT struggle to generate authentic early 20th-century language, with fine-tuning improving style but not fully eliminating modern biases.
Microsoft launched the Phi-4-Reasoning family, a set of 14B parameter open-weight models optimized for complex reasoning tasks. These models demonstrate competitive performance on math, planning, and coding challenges with transparent training and open access.
Meta AI has unveiled ReasonIR-8B, a highly efficient retriever designed for complex reasoning tasks in RAG systems, achieving state-of-the-art results with significantly lower computational costs.
Researchers from Edinburgh, Cohere, and Meta demonstrate that large sparse models can outperform smaller dense models for long-context LLMs by leveraging sparse attention, offering new scaling laws and standardized methods.
Atla's detailed τ-Bench analysis and EvalToolbox introduce real-time diagnosis and correction of LLM agent failures, enhancing performance beyond traditional evaluation methods.
A recent study reveals that being polite to AI does not improve answer quality, since changes in output quality trace back to a prompt's content tokens rather than to its courteous wording.
THINKPRM introduces a generative process reward model that significantly improves reasoning verification with minimal supervision, outperforming traditional discriminative models across key benchmarks.
Alibaba's Qwen3 introduces a new generation of large language models that excel in hybrid reasoning, multilingual understanding, and efficient scalability, setting new standards in AI performance.
Discover a practical tutorial on implementing the Model Context Protocol to manage context effectively for large language models using semantic chunking and dynamic token management.
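This is not the Model Context Protocol itself, but a minimal sketch of the two ideas the tutorial combines: naive semantic chunking (paragraph splits stand in for an embedding-based splitter) and a dynamic token budget that packs the highest-priority chunks into the context window, with tiktoken used for counting:

```python
# Chunk a document, then greedily pack the highest-scoring chunks that fit a token budget.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def pack_context(chunks_with_scores, budget_tokens=1000):
    """Greedily keep the highest-scoring chunks that fit within the token budget."""
    packed, used = [], 0
    for score, chunk in sorted(chunks_with_scores, reverse=True):
        n = len(enc.encode(chunk))
        if used + n <= budget_tokens:
            packed.append(chunk)
            used += n
    return "\n\n".join(packed), used

document = "First paragraph about billing.\n\nSecond paragraph about refunds.\n\nThird, unrelated notes."
chunks = [c.strip() for c in document.split("\n\n") if c.strip()]
scored = [(0.9, chunks[1]), (0.7, chunks[0]), (0.1, chunks[2])]  # scores would come from a retriever

context, used = pack_context(scored, budget_tokens=40)
print(f"{used} tokens packed:\n{context}")
```

In an MCP setting the packed context would be served to the model through a protocol server rather than concatenated by hand.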
ByteDance unveils QuaDMix, a unified framework that enhances large language model pretraining by jointly optimizing data quality and diversity, leading to significant performance gains.
Google DeepMind introduces QuestBench, a benchmark designed to evaluate how well large language models identify missing information in complex reasoning tasks and generate necessary clarifying questions.
Xata Agent is an open-source AI tool designed to proactively monitor PostgreSQL databases, automate troubleshooting, and integrate smoothly into DevOps workflows, reducing the burden on DBAs and improving performance.
AWS AI Labs has launched SWE-PolyBench, an open-source, multilingual benchmark designed to evaluate AI coding agents with real-world coding tasks across multiple languages, improving upon previous limited benchmarks.
Researchers at UNC Chapel Hill introduced TACQ, a task-aware quantization method that preserves critical weight circuits, allowing large language models to maintain high accuracy even at ultra-low 2-bit precision compression.
OpenAI's Sam Altman reveals that polite interactions with AI cost tens of millions in computing resources, raising questions about the environmental impact and value of AI etiquette.